Added Query Response Headers #44593
Pull request overview
This pull request adds the ability to retrieve response headers from query_items() operations in the Azure Cosmos DB Python SDK. This addresses customer feedback requesting access to important headers like x-ms-request-charge and x-ms-activity-id during query pagination.
Key Changes:
- Introduced CosmosItemPaged and CosmosAsyncItemPaged wrapper classes that extend ItemPaged/AsyncItemPaged to expose response header retrieval methods
- Added header tracking to QueryIterable (sync and async) via a _response_headers list that captures the headers on each page fetch
- Updated query_items() return types across container.py and _container.py to use the new wrapper classes
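The shape of the change can be illustrated with a minimal, self-contained sketch. It is not the SDK implementation: `HeaderTrackingPaged` and `fake_fetch` are hypothetical stand-ins, and only the method names `get_response_headers()` and `get_last_response_headers()` come from this PR. The idea is that a pager records one header mapping per fetched page and hands out copies on request.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional, Tuple

Headers = Dict[str, Any]
# A fetch call returns (items, response headers, next continuation token or None).
FetchResult = Tuple[List[dict], Headers, Optional[str]]


class HeaderTrackingPaged:
    """Hypothetical stand-in for CosmosItemPaged; not the SDK class."""

    def __init__(self, fetch_page: Callable[[Optional[str]], FetchResult]) -> None:
        self._fetch_page = fetch_page
        self._response_headers: List[Headers] = []

    def __iter__(self) -> Iterator[dict]:
        token: Optional[str] = None
        while True:
            items, headers, token = self._fetch_page(token)
            # Record a copy of the headers for every page that is fetched.
            self._response_headers.append(dict(headers))
            yield from items
            if token is None:
                break

    def get_response_headers(self) -> List[Headers]:
        # Return copies so callers cannot mutate the recorded headers.
        return [h.copy() for h in self._response_headers]

    def get_last_response_headers(self) -> Optional[Headers]:
        if self._response_headers:
            return self._response_headers[-1].copy()
        return None


def fake_fetch(token: Optional[str]) -> FetchResult:
    """Hypothetical two-page backend used only for this illustration."""
    pages: Dict[Optional[str], FetchResult] = {
        None: (
            [{"id": "1"}, {"id": "2"}],
            {"x-ms-request-charge": "2.5", "x-ms-activity-id": "a-1"},
            "t1",
        ),
        "t1": (
            [{"id": "3"}],
            {"x-ms-request-charge": "1.0", "x-ms-activity-id": "a-2"},
            None,
        ),
    }
    return pages[token]


paged = HeaderTrackingPaged(fake_fetch)
item_ids = [item["id"] for item in paged]
total_charge = sum(float(h["x-ms-request-charge"]) for h in paged.get_response_headers())
```

With headers kept per page, a caller can sum `x-ms-request-charge` across the whole query or log the `x-ms-activity-id` of a specific page.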
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| azure/cosmos/_cosmos_responses.py | Adds CosmosItemPaged and CosmosAsyncItemPaged classes with get_response_headers() and get_last_response_headers() methods |
| azure/cosmos/_query_iterable.py | Adds header capture logic and response header retrieval methods to QueryIterable |
| azure/cosmos/aio/_query_iterable_async.py | Adds header capture logic and response header retrieval methods to async QueryIterable |
| azure/cosmos/_cosmos_client_connection.py | Updates QueryItems() to return CosmosItemPaged instead of ItemPaged |
| azure/cosmos/aio/_cosmos_client_connection_async.py | Updates QueryItems() to return CosmosAsyncItemPaged instead of AsyncItemPaged |
| azure/cosmos/container.py | Updates query_items() return type annotations from ItemPaged to CosmosItemPaged |
| azure/cosmos/aio/_container.py | Updates query_items() return type annotations from AsyncItemPaged to CosmosAsyncItemPaged |
| tests/test_query_response_headers.py | Comprehensive sync tests for the new header retrieval functionality |
| tests/test_query_response_headers_async.py | Comprehensive async tests for the new header retrieval functionality |
| tests/test_config.py | Contains commented-out imports and credential changes that appear to be debugging/development artifacts |
Comments suppressed due to low confidence (2)
sdk/cosmos/azure-cosmos/azure/cosmos/container.py:805
- The docstring for query_items should document the new response header retrieval capability. Users need to know that the returned CosmosItemPaged object provides get_response_headers() and get_last_response_headers() methods to access response headers from query operations. Consider adding a note in the returns section or in the main description about this feature.
"""Return all results matching the given `query`.
You can use any value for the container name in the FROM clause, but
often the container name is used. In the examples below, the container
name is "products," and is aliased as "p" for easier referencing in
the WHERE clause.
:param str query: The Azure Cosmos DB SQL query to execute.
:param parameters: Optional array of parameters to the query.
Each parameter is a dict() with 'name' and 'value' keys.
Ignored if no query is provided.
:type parameters: [list[dict[str, object]]]
:param partition_key: Partition key at which the query request is targeted. If the partition key is set to
None, it will perform a cross partition query. To learn more about using partition keys, see `here
<https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/cosmos/azure-cosmos/docs/PartitionKeys.md>`_.
:type partition_key: ~azure.cosmos.partition_key.PartitionKeyType
:param bool enable_cross_partition_query: Allows sending of more than one request to
execute the query in the Azure Cosmos DB service.
More than one request is necessary if the query is not scoped to single partition key value.
:param int max_item_count: Max number of items to be returned in the enumeration operation.
:param bool enable_scan_in_query: Allow scan on the queries which couldn't be served as
indexing was opted out on the requested paths.
:param bool populate_query_metrics: Enable returning query metrics in response headers.
:keyword bool populate_index_metrics: Used to obtain the index metrics to understand how the query engine used
existing indexes and how it could use potential new indexes. Please note that this option will incur
overhead, so it should be enabled only when debugging slow queries.
:keyword int continuation_token_limit: The size limit in kb of the response continuation token in the query
response. Valid values are positive integers.
A value of 0 is the same as not passing a value (default no limit).
:keyword Sequence[str] excluded_locations: Excluded locations to be skipped from preferred locations. The locations
in this list are specified as the names of the Azure Cosmos locations like, 'West US', 'East US' and so on.
If all preferred locations were excluded, primary/hub location will be used.
This excluded_location will override existing excluded_locations in client level.
:keyword dict[str, str] initial_headers: Initial headers to be sent as part of the request.
:keyword int max_integrated_cache_staleness_in_ms: The max cache staleness for the integrated cache in
milliseconds. For accounts configured to use the integrated cache, using Session or Eventual consistency,
responses are guaranteed to be no staler than this value.
:keyword Literal["High", "Low"] priority: Priority based execution allows users to set a priority for each
request. Once the user has reached their provisioned throughput, low priority requests are throttled
before high priority requests start getting throttled. Feature must first be enabled at the account level.
:keyword response_hook: A callable invoked with the response metadata.
:paramtype response_hook: Callable[[Mapping[str, str], dict[str, Any]], None]
:keyword str session_token: Token for use with Session consistency.
:keyword int throughput_bucket: The desired throughput bucket for the client.
:keyword dict[str, Any] availability_strategy_config:
The threshold-based availability strategy to use for this request.
If not provided, the client's default strategy will be used.
:returns: An Iterable of items (dicts).
:rtype: CosmosItemPaged
.. admonition:: Example:

    .. literalinclude:: ../samples/examples.py
        :start-after: [START query_items]
        :end-before: [END query_items]
        :language: python
        :dedent: 0
        :caption: Get all products that have not been discontinued:

    .. literalinclude:: ../samples/examples.py
        :start-after: [START query_items_param]
        :end-before: [END query_items_param]
        :language: python
        :dedent: 0
        :caption: Parameterized query to get all products that have been discontinued:
"""
sdk/cosmos/azure-cosmos/azure/cosmos/aio/_container.py:614
- The docstring for query_items should document the new response header retrieval capability. Users need to know that the returned CosmosAsyncItemPaged object provides get_response_headers() and get_last_response_headers() methods to access response headers from query operations. Consider adding a note in the returns section or in the main description about this feature.
"""Return all results matching the given `query`.
You can use any value for the container name in the FROM clause, but
often the container name is used. In the examples below, the container
name is "products," and is aliased as "p" for easier referencing in
the WHERE clause.
:param str query: The Azure Cosmos DB SQL query to execute.
:keyword int continuation_token_limit: The size limit in kb of the response continuation token in the query
response. Valid values are positive integers.
A value of 0 is the same as not passing a value (default no limit).
:keyword bool enable_scan_in_query: Allow scan on the queries which couldn't be served as
indexing was opted out on the requested paths.
:keyword Sequence[str] excluded_locations: Excluded locations to be skipped from preferred locations. The locations
in this list are specified as the names of the Azure Cosmos locations like, 'West US', 'East US' and so on.
If all preferred locations were excluded, primary/hub location will be used.
This excluded_location will override existing excluded_locations in client level.
:keyword dict[str, str] initial_headers: Initial headers to be sent as part of the request.
:keyword int max_integrated_cache_staleness_in_ms: The max cache staleness for the integrated cache in
milliseconds. For accounts configured to use the integrated cache, using Session or Eventual consistency,
responses are guaranteed to be no staler than this value.
:keyword int max_item_count: Max number of items to be returned in the enumeration operation.
:keyword parameters: Optional array of parameters to the query.
Each parameter is a dict() with 'name' and 'value' keys.
Ignored if no query is provided.
:paramtype parameters: [list[dict[str, object]]]
:keyword partition_key: Partition key at which the query request is targeted. If the partition key is set to
None, it will perform a cross partition query. To learn more about using partition keys, see `here
<https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/cosmos/azure-cosmos/docs/PartitionKeys.md>`_.
:paramtype partition_key: ~azure.cosmos.partition_key.PartitionKeyType
:keyword bool populate_index_metrics: Used to obtain the index metrics to understand how the query engine used
existing indexes and how it could use potential new indexes. Please note that this option will incur
overhead, so it should be enabled only when debugging slow queries.
:keyword bool populate_query_metrics: Enable returning query metrics in response headers.
:keyword Literal["High", "Low"] priority: Priority based execution allows users to set a priority for each
request. Once the user has reached their provisioned throughput, low priority requests are throttled
before high priority requests start getting throttled. Feature must first be enabled at the account level.
:keyword response_hook: A callable invoked with the response metadata.
:paramtype response_hook: Callable[[Mapping[str, str], dict[str, Any]], None]
:keyword str session_token: Token for use with Session consistency.
:keyword int throughput_bucket: The desired throughput bucket for the client.
:keyword dict[str, Any] availability_strategy_config:
The threshold-based availability strategy to use for this request.
If not provided, the client's default strategy will be used.
:returns: An Iterable of items (dicts).
:rtype: CosmosAsyncItemPaged
.. admonition:: Example:

    .. literalinclude:: ../samples/examples_async.py
        :start-after: [START query_items]
        :end-before: [END query_items]
        :language: python
        :dedent: 0
        :caption: Get all products that have not been discontinued:

    .. literalinclude:: ../samples/examples_async.py
        :start-after: [START query_items_param]
        :end-before: [END query_items_param]
        :language: python
        :dedent: 0
        :caption: Parameterized query to get all products that have been discontinued:
"""
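A usage note of the kind suggested for these docstrings could be backed by a short example. The sketch below is an assumption-laden illustration rather than SDK code: `FakeAsyncPaged` is a hypothetical stand-in for `CosmosAsyncItemPaged`, and the page data is invented. It shows the async pattern of draining the iterator first and then reading the per-page headers.

```python
import asyncio
from typing import Any, AsyncIterator, Dict, List, Optional, Tuple

Headers = Dict[str, Any]


class FakeAsyncPaged:
    """Hypothetical stand-in for CosmosAsyncItemPaged; not the SDK class."""

    def __init__(self, pages: List[Tuple[List[dict], Headers]]) -> None:
        self._pages = pages
        self._response_headers: List[Headers] = []

    def __aiter__(self) -> AsyncIterator[dict]:
        return self._iterate()

    async def _iterate(self) -> AsyncIterator[dict]:
        for items, headers in self._pages:
            await asyncio.sleep(0)  # stands in for the network round trip
            self._response_headers.append(dict(headers))
            for item in items:
                yield item

    def get_response_headers(self) -> List[Headers]:
        return [h.copy() for h in self._response_headers]

    def get_last_response_headers(self) -> Optional[Headers]:
        if self._response_headers:
            return self._response_headers[-1].copy()
        return None


async def total_request_charge(paged: FakeAsyncPaged) -> float:
    """Consume every item, then sum the request charge over all pages."""
    async for _ in paged:
        pass
    return sum(float(h["x-ms-request-charge"]) for h in paged.get_response_headers())


example_pages = [
    ([{"id": "1"}], {"x-ms-request-charge": "2.5", "x-ms-activity-id": "a-1"}),
    ([{"id": "2"}], {"x-ms-request-charge": "1.0", "x-ms-activity-id": "a-2"}),
]
charge = asyncio.run(total_request_charge(FakeAsyncPaged(example_pages)))
```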
API Change Check
APIView identified API level changes in this PR and created the following API reviews
/azp run python - cosmos - tests
Azure Pipelines successfully started running 1 pipeline(s).
/azp run python - cosmos - tests
Azure Pipelines successfully started running 1 pipeline(s).
/azp run python - cosmos - tests
Azure Pipelines successfully started running 1 pipeline(s).
    first_page_headers = response_headers[0]
    assert "x-ms-request-charge" in first_page_headers
    assert "x-ms-activity-id" in first_page_headers

    # Verify get_last_response_headers works
    last_headers = query_iterable.get_last_response_headers()
    assert last_headers is not None
    assert "x-ms-request-charge" in last_headers
Would love an assertion on the request charge, the activity id, or something similar to show that the first page saved is the same first page saved in the total response_headers.
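One way to express the requested assertion is sketched below. It assumes only the first page has been fetched at this point in the test, so the last headers and the first entry of the full list should describe the same response. `StubQueryIterable` and `assert_first_page_consistent` are hypothetical helpers written for this illustration, not part of the PR.

```python
from typing import Any, Dict, List, Optional

Headers = Dict[str, Any]


class StubQueryIterable:
    """Hypothetical stub exposing the two methods named in this PR."""

    def __init__(self, pages_headers: List[Headers]) -> None:
        self._response_headers = [dict(h) for h in pages_headers]

    def get_response_headers(self) -> List[Headers]:
        return [h.copy() for h in self._response_headers]

    def get_last_response_headers(self) -> Optional[Headers]:
        if self._response_headers:
            return self._response_headers[-1].copy()
        return None


def assert_first_page_consistent(query_iterable: StubQueryIterable) -> None:
    """After one page fetch, both accessors should report the same response."""
    response_headers = query_iterable.get_response_headers()
    assert len(response_headers) == 1
    first_page_headers = response_headers[0]
    last_headers = query_iterable.get_last_response_headers()
    assert last_headers is not None
    # Compare concrete values instead of only checking that the keys exist.
    for key in ("x-ms-request-charge", "x-ms-activity-id"):
        assert last_headers[key] == first_page_headers[key]


stub = StubQueryIterable(
    [{"x-ms-request-charge": "2.5", "x-ms-activity-id": "a-1"}]
)
assert_first_page_consistent(stub)
```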
simorenoh left a comment
Looking pretty good, just a couple things - thanks!
        """
        if self._response_headers:
            return self._response_headers[-1].copy()
        return None
We should make sure that this behavior matches the behavior of the other header method. If there's nothing, we can just return an empty list; having one return None and the other return an empty list seems weird.
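A possible reading of this suggestion is sketched below. Because the "last headers" accessor returns a single header mapping, this sketch assumes the empty case becomes an empty dict, mirroring the empty list returned by the other accessor; whether the PR settles on a dict or a list is not stated here. `HeaderStore` and `record` are hypothetical names used only for illustration.

```python
from typing import Any, Dict, List

Headers = Dict[str, Any]


class HeaderStore:
    """Hypothetical holder for per-page headers; not the SDK class."""

    def __init__(self) -> None:
        self._response_headers: List[Headers] = []

    def record(self, headers: Headers) -> None:
        self._response_headers.append(dict(headers))

    def get_response_headers(self) -> List[Headers]:
        # Empty case: an empty list.
        return [h.copy() for h in self._response_headers]

    def get_last_response_headers(self) -> Headers:
        # Empty case: an empty mapping instead of None (assumption),
        # so neither accessor forces callers to handle None.
        if self._response_headers:
            return self._response_headers[-1].copy()
        return {}


store = HeaderStore()
empty_last = store.get_last_response_headers()
store.record({"x-ms-request-charge": "2.5"})
```

Callers can then write `store.get_last_response_headers().get("x-ms-request-charge")` without a None check in either state.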
    def _capture_response_headers(self) -> None:
        """Capture response headers from the last request."""
        if self._client.last_response_headers:
            headers = self._client.last_response_headers.copy()
            self._response_headers.append(headers)
This is not thread-safe. Because last_response_headers on the client is populated by every request that is executed, there is no guarantee that we will read the headers of the relevant request. We probably need to hook this back to wherever the actual query response is saved into last_response_headers and keep the original headers for the new CosmosPaged objects.
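The hazard can be reproduced deterministically without threads. In the sketch below, `FakeClient` and its `execute` method are hypothetical; the only detail taken from the PR is that the client keeps a shared `last_response_headers` attribute. An unrelated request landing between the query fetch and the capture makes the shared attribute report the wrong headers, while headers returned alongside the response stay correct.

```python
from typing import Any, Dict, List, Tuple

Headers = Dict[str, Any]


class FakeClient:
    """Hypothetical client with a shared last_response_headers attribute."""

    def __init__(self) -> None:
        self.last_response_headers: Headers = {}

    def execute(self, activity_id: str) -> Tuple[List[dict], Headers]:
        headers = {"x-ms-activity-id": activity_id}
        # Shared state: overwritten by every request on this client.
        self.last_response_headers = headers
        # Per-request state: a copy returned with the response itself.
        return [{"id": activity_id}], dict(headers)


client = FakeClient()

# The query fetches its first page...
_, query_headers = client.execute("query-page-1")
# ...and another caller's request runs before the headers are captured.
client.execute("unrelated-read")

captured_from_shared = client.last_response_headers.copy()
captured_from_response = query_headers
```

Capturing from the value returned with the response (or from a per-request hook) ties the headers to the request that produced them, which is the direction the comment points toward.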
from .._base import (_build_properties_cache, _deserialize_throughput, _replace_throughput,
                     build_options as _build_options, GenerateGuidId, validate_cache_staleness_value)
from .._change_feed.feed_range_internal import FeedRangeInternalEpk
Changelog entry?
Description
This PR adds the ability to retrieve response headers from query_items() operations, addressing customer feedback requesting access to headers like x-ms-request-charge and x-ms-activity-id during query pagination.
Changes
Both classes provide:
Updated Files
Tests Added